Music video


Taylor Swift fans flock to German museum to see Ophelia painting

BBC News

Taylor Swift fans are driving a surge in popularity of a German museum exhibiting a portrait of the Shakespeare character Ophelia, recently reimagined in a song and music video from Swift's new album The Life of a Showgirl. The Hessische Landesmuseum in the central German city of Wiesbaden saw hundreds more visitors than usual over the weekend, as fans hoped to see the real version of the painting that opens the music video for The Fate of Ophelia. In the video, viewed more than 65 million times on YouTube, the painting comes alive, with Swift at its centre. "We're really enjoying this attention - it's a lot of fun," museum spokesperson Susanne Hirschmann told the Associated Press. Hirschmann said that one family had travelled from the northern city of Hamburg, a five-hour drive away, while some of the visitors were Americans from an army base nearby.


'Meteor' streaks through Britain's skies tonight leaving lucky gazers in awe

Daily Mail - Science & tech

Brits have been left in awe after spotting what is believed to be a 'meteor' glowing through the night sky. Lucky stargazers in Northfields and West Ealing, west London, have reported seeing a blue-ish green blob race through the city's sky tonight.


How Taylor Swift is helping botany gain celebrity status

New Scientist

Feedback is delighted to learn that researchers have discovered what Taylor Swift is accidentally doing to rescue the science of plants from mid-ness. We never miss a beat, so Feedback, prompted by assistant news editor and Swiftie Alexandra Thompson, has been taking a close look at a major paper in the Annals of Botany, published in August. It is called "Dance with plants: Taylor Swift's music videos as advance organizers for meaningful learning in botany". The thesis is that high school students exhibit "a general low interest in plants", leading to "plant blindness". Teachers struggling to convey the magic of botany are left repeating material and getting sick of it.


From Sound to Sight: Towards AI-authored Music Videos

Vitasovic, Leo, Graßhof, Stella, Kloft, Agnes Mercedes, Lehtola, Ville V., Cunneen, Martin, Starostka, Justyna, McGarry, Glenn, Li, Kun, Brandt, Sami S.

arXiv.org Artificial Intelligence

Conventional music visualisation systems rely on handcrafted ad hoc transformations of shapes and colours that offer only limited expressiveness. We propose two novel pipelines for automatically generating music videos from any user-specified, vocal or instrumental song using off-the-shelf deep learning models. Inspired by the manual workflows of music video producers, we investigate how well latent feature-based techniques can analyse audio to detect musical qualities, such as emotional cues and instrumental patterns, and distil them into textual scene descriptions using a language model. Next, we employ a generative model to produce the corresponding video clips. To assess the generated videos, we identify several critical aspects and design and conduct a preliminary user evaluation that demonstrates storytelling potential, visual coherency and emotional alignment with the music. Our findings underscore the potential of latent feature techniques and deep generative models to expand music visualisation beyond traditional approaches.
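The audio-to-prompt stage the abstract describes can be sketched as follows. This is a toy illustration, not the authors' pipeline: the function names, thresholds, and descriptors are all hypothetical, standing in for the latent-feature analysis and language-model stages.

```python
# Toy sketch of the feature-to-prompt stage: map coarse audio descriptors
# (which a real system would extract with learned models) to a textual
# scene description that a text-to-video model could consume.

def describe_scene(tempo_bpm: float, energy: float, mode: str) -> str:
    """Distil simple musical qualities into a scene prompt."""
    pace = "fast-moving" if tempo_bpm >= 120 else "slow, drifting"
    if energy > 0.7 and mode == "major":
        mood = "euphoric"
    elif mode == "minor":
        mood = "melancholic"
    else:
        mood = "calm"
    return f"A {pace} {mood} landscape, camera following the rhythm"

print(describe_scene(140, 0.9, "major"))
```

A real pipeline would feed such a description, per song segment, to a text-to-video generator to produce the corresponding clip.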


Secure & Personalized Music-to-Video Generation via CHARCHA

Agarwal, Mehul, Agarwal, Gauri, Benoit, Santiago, Lippman, Andrew, Oh, Jean

arXiv.org Artificial Intelligence

Music is a deeply personal experience and our aim is to enhance this with a fully automated pipeline for personalized music video generation. Our work allows listeners to not just be consumers but co-creators in the music video generation process by creating personalized, consistent and context-driven visuals based on lyrics, rhythm and emotion in the music. The pipeline combines multimodal translation and generation techniques and utilizes low-rank adaptation on listeners' images to create immersive music videos that reflect both the music and the individual. To ensure the ethical use of users' identity, we also introduce CHARCHA, a facial identity verification protocol that protects people against unauthorized use of their face while at the same time collecting authorized images from users for personalizing their videos. This paper thus provides a secure and innovative framework for creating deeply personalized music videos. Figure 1: Image stills and lyrics from generated music videos for Rick Astley's "Never Gonna Give You Up," with character reference from CHARCHA. The videos use Queratogray Sketch[1], Western Animation Diffusion[2], and Realistic Vision V5.1[3] checkpoint models.
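The low-rank adaptation (LoRA) idea mentioned above can be illustrated in a few lines. This is a minimal NumPy sketch of the general technique, not the paper's implementation; dimensions and initialization follow the standard LoRA recipe (frozen weight plus a trainable low-rank update, with the up-projection zero-initialized so training starts from the pretrained behaviour).

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                              # feature dim, adapter rank (r << d)
W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init

def adapted_forward(x):
    # Frozen layer output plus the low-rank personalization term (B @ A).
    return x @ W.T + x @ (B @ A).T

x = rng.standard_normal((1, d))
# With B initialized to zero, the adapter starts as an exact no-op.
assert np.allclose(adapted_forward(x), x @ W.T)
```

Only A and B (2·d·r parameters) are trained on the listener's images, which is why personalization stays cheap relative to fine-tuning the full d×d weight.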


MuseChat: A Conversational Music Recommendation System for Videos

Dong, Zhikang, Chen, Bin, Liu, Xiulong, Polak, Pawel, Zhang, Peng

arXiv.org Artificial Intelligence

Music recommendation for videos attracts growing interest in multi-modal research. However, existing systems focus primarily on content compatibility, often ignoring the users' preferences. Their inability to interact with users for further refinements or to provide explanations leads to a less satisfying experience. We address these issues with MuseChat, a first-of-its-kind dialogue-based recommendation system that personalizes music suggestions for videos. Our system consists of two key functionalities with associated modules: recommendation and reasoning. The recommendation module takes a video along with optional information, including previously suggested music and the user's preferences, as inputs and retrieves appropriate music matching the context. The reasoning module, equipped with the power of a Large Language Model (Vicuna-7B) and extended to multi-modal inputs, is able to provide reasonable explanations for the recommended music. To evaluate the effectiveness of MuseChat, we build a large-scale dataset, conversational music recommendation for videos, that simulates a two-turn interaction between a user and a recommender based on accurate music track information. Experiment results show that MuseChat achieves significant improvements over existing video-based music retrieval methods as well as offers strong interpretability and interactivity.
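The retrieval step, blending content compatibility with user preference, can be sketched with toy embeddings. This is a generic illustration of embedding-based retrieval, not MuseChat's actual model; the vectors, track names, and the blending weight `alpha` are all made up for the example.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def recommend(video_vec, tracks, pref_vec, alpha=0.3):
    # Blend content compatibility (video vs. track) with a user-preference
    # term, addressing the "content compatibility only" limitation above.
    scored = {
        name: (1 - alpha) * cosine(video_vec, vec) + alpha * cosine(pref_vec, vec)
        for name, vec in tracks.items()
    }
    return max(scored, key=scored.get)

tracks = {"upbeat_pop": [1.0, 0.1], "ambient": [0.1, 1.0]}
print(recommend([0.9, 0.2], tracks, pref_vec=[1.0, 0.0]))
```

In the full system, a second module (the LLM-based reasoner) would then generate an explanation for why the retrieved track fits the video.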


The Beatles' "Final" Music Video Is an Abomination

Slate

As the other members of the Beatles sing and play, Lennon, ever the cut-up, clowns around, bouncing from one leg to the other with a grin on his face. His hands move like flippers, turned out at an odd angle and making frantic circles in the air, as if he's wiping down an invisible window. And as his body moves from side to side, his head seems to lag slightly behind it. The larkish ebullience feels strained and off-kilter, like an audience that wants to clap along but can't find the beat. The music video for "Now and Then," which has been billed as "the last Beatles song," starts off as an affectionate nostalgia trip, intercutting present-day footage of the two surviving Beatles with archival footage of their late bandmates.


FiLM: Fill-in Language Models for Any-Order Generation

Shen, Tianxiao, Peng, Hao, Shen, Ruoqi, Fu, Yao, Harchaoui, Zaid, Choi, Yejin

arXiv.org Artificial Intelligence

Language models have become the backbone of today's AI systems. However, their predominant left-to-right generation limits the use of bidirectional context, which is essential for tasks that involve filling text in the middle. We propose the Fill-in Language Model (FiLM), a new language modeling approach that allows for flexible generation at any position without adhering to a specific generation order. Its training extends the masked language modeling objective by adopting varying mask probabilities sampled from the Beta distribution to enhance the generative capabilities of FiLM. During inference, FiLM can seamlessly insert missing phrases, sentences, or paragraphs, ensuring that the outputs are fluent and coherent with the surrounding context. In both automatic and human evaluations, FiLM outperforms existing infilling methods that rely on left-to-right language models trained on rearranged text segments. FiLM is easy to implement and can be either trained from scratch or fine-tuned from a left-to-right language model. Notably, as the model size grows, FiLM's perplexity approaches that of strong left-to-right language models of similar sizes, indicating FiLM's scalability and potential as a large language model.
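The masking scheme the abstract describes, drawing a per-sequence mask rate from a Beta distribution rather than using a fixed rate, can be sketched as follows. This is a minimal illustration of that training step only; the Beta hyperparameters here are illustrative, not the paper's.

```python
import random

MASK = "[MASK]"

def film_mask(tokens, a=2.0, b=2.0, rng=random.Random(0)):
    # Draw a fresh masking rate p ~ Beta(a, b) for this sequence, then mask
    # each token independently with probability p. Varying p across sequences
    # exposes the model to everything from light infilling to near-full
    # generation, which is what enables any-order generation at inference.
    p = rng.betavariate(a, b)
    return [MASK if rng.random() < p else t for t in tokens]

print(film_mask("the cat sat on the mat".split()))
```

Training then asks the model to predict the original tokens at the masked positions, conditioned on the unmasked context on both sides.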


The Origin Story of "Stop Making Sense"

The New Yorker

When it first opened in theatres, in the fall of 1984, "Stop Making Sense," directed by Jonathan Demme and starring the rock group Talking Heads, was quickly recognized as one of the finest concert films ever made. Reviewer after reviewer settled on the word "exhilarating" to describe the experience of watching an expanded nine-member iteration of the four-piece group perform sixteen of their best-known songs in an uninterrupted sequence of dynamically staged and photographed musical vignettes. In the pages of this magazine, Pauline Kael praised the film as "close to perfection," and described the Heads front man, David Byrne, as "a stupefying performer." "He's so white he's almost mock-white," Kael wrote, "and so are his jerky, long-necked, mechanical-man movements. He seems fleshless, bloodless; he might almost be a Black man's parody of how a clean-cut white man moves. But Byrne himself is the parodist, and he commands the stage by his hollow-eyed, frosty verve."


Generative Disco: Text-to-Video Generation for Music Visualization

Liu, Vivian, Long, Tao, Raw, Nathan, Chilton, Lydia

arXiv.org Artificial Intelligence

Visuals can enhance our experience of music, owing to the way they can amplify the emotions and messages conveyed within it. However, creating music visualization is a complex, time-consuming, and resource-intensive process. We introduce Generative Disco, a generative AI system that helps generate music visualizations with large language models and text-to-video generation. The system helps users visualize music in intervals by finding prompts to describe the images that intervals start and end on and interpolating between them to the beat of the music. We introduce design patterns for improving these generated videos: transitions, which express shifts in color, time, subject, or style, and holds, which help focus the video on subjects. A study with professionals showed that transitions and holds were a highly expressive framework that enabled them to build coherent visual narratives. We conclude by discussing the generalizability of these patterns and the potential of generated video for creative professionals.
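The interpolation idea, moving from an interval's start prompt to its end prompt in step with the beat, can be sketched with plain linear interpolation. This is a generic illustration, not the system's code: the vectors stand in for prompt embeddings, and real systems interpolate in a model's latent space rather than over raw lists.

```python
def beat_interpolation(start_vec, end_vec, n_beats):
    # Produce one interpolated vector per beat, so the visuals progress
    # from the interval's start image to its end image in time with the music.
    frames = []
    for i in range(n_beats):
        t = i / (n_beats - 1) if n_beats > 1 else 0.0
        frames.append([(1 - t) * s + t * e for s, e in zip(start_vec, end_vec)])
    return frames

frames = beat_interpolation([0.0, 1.0], [1.0, 0.0], 5)
print(frames[0], frames[-1])
```

A "hold", in this sketch, would simply reuse the same vector across several beats, keeping the video focused on one subject before the next transition begins.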